Variable Selection for Sparse Nonlinear Functional Additive Model

Statistical Research, 2021, Vol. 38, Issue (5): 109-120. doi: 10.19343/j.cnki.11-1302/c.2021.05.009

Bai Yongxin, Tian Maozai

Online: 2021-05-25 Published: 2021-05-25

Abstract

This paper studies variable selection for the nonlinear additive model in which both the response variable and the covariates are functional data. First, based on the functional distance correlation, we construct an F test statistic: the functional distance correlation coefficients between the covariates and the residuals are ranked, an F test of independence is conducted between the residuals and the covariate with the largest correlation coefficient, and a new variable that satisfies the condition is admitted to the model. Second, the contribution of each newly admitted variable is evaluated to confirm whether it should ultimately remain in the model. Because candidates are picked with a model-free measure, the procedure separates variable selection from model estimation, which reduces the dimension of the covariates in the regression. In addition, using the residuals during the iterations retains information about the model and thereby improves the accuracy of variable selection. Finally, the performance of the proposed procedure is assessed with Monte Carlo simulation studies and further demonstrated on a dataset of appliance energy consumption.

Key words: Functional Response Variables, Nonlinear Additive Model, Variable Selection
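
The screening step described in the abstract (rank the covariates by their distance correlation with the current residuals, then test the top-ranked one for independence) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration rather than the authors' implementation: curves are taken to be observed on a common grid, the function names are hypothetical, and a permutation test stands in for the paper's F test, whose form the abstract does not give. The second step, evaluating the contribution of an admitted variable, is not shown.

```python
import numpy as np

def _centered_dist(curves):
    # curves: (n, T) array, one discretized curve per row (assumed common grid).
    # Pairwise L2 distances; the grid spacing is omitted because it only
    # rescales distances and distance correlation is scale-invariant.
    diff = curves[:, None, :] - curves[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    # Double centering: remove row and column means, add back the grand mean.
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    # Sample distance correlation between two sets of n curves.
    a, b = _centered_dist(x), _centered_dist(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    return 0.0 if denom == 0 else float(np.sqrt(max(dcov2, 0.0) / denom))

def permutation_pvalue(x, y, n_perm=199, rng=None):
    # Stand-in independence test (NOT the paper's F test): permute the
    # sample labels of y and compare the squared distance covariances.
    rng = np.random.default_rng(rng)
    a, b = _centered_dist(x), _centered_dist(y)
    observed = (a * b).mean()
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(len(x))
        count += (a * b[np.ix_(p, p)]).mean() >= observed
    return (count + 1) / (n_perm + 1)

def pick_candidate(covariates, residual, selected=(), alpha=0.05, rng=0):
    # Rank the not-yet-selected covariates by distance correlation with the
    # residual curves; return the top one only if independence is rejected.
    remaining = [j for j in range(len(covariates)) if j not in selected]
    if not remaining:
        return None
    scores = {j: distance_correlation(covariates[j], residual) for j in remaining}
    best = max(scores, key=scores.get)
    p_value = permutation_pvalue(covariates[best], residual, rng=rng)
    return best if p_value < alpha else None

# Toy data (purely synthetic): the response depends nonlinearly on covariate 0.
rng = np.random.default_rng(1)
n, grid = 80, np.linspace(0.0, 1.0, 20)

def random_curves():
    a, b = rng.normal(size=(2, n, 1))
    return a * np.sin(2 * np.pi * grid) + b * np.cos(2 * np.pi * grid)

covariates = [random_curves() for _ in range(3)]
response = np.exp(covariates[0]) + 0.1 * rng.normal(size=(n, grid.size))
residual = response - response.mean(axis=0)  # residual of the empty model
chosen = pick_candidate(covariates, residual)
```

In a full iteration, the chosen covariate would be added to the model, the residuals recomputed from the refitted model, and `pick_candidate` called again with the updated `selected` set until no candidate passes the test.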

Bai Yongxin, Tian Maozai. Variable Selection for Sparse Nonlinear Functional Additive Model[J]. Statistical Research, 2021, 38(5): 109-120.
